ERROR CORRECTION

IN DATA 
TRANSMISSION 

Using Hamming codes to detect and 
correct errors in digital transmissions 



HF communications techniques for 
hams have undergone a dramatic 
change over the past ten years. 
Electro-mechanical ASRs have given way to 
computer terminals, and Baudot has lost 
some of its popularity to ASCII, AMTOR, 
and packet. Data rates have increased from 
the venerable 45 baud to 300. But a "quick" 
packet exchange under less than ideal condi- 
tions proves that error detection and correc- 
tion codes haven't kept up with these 
changes. In fact, packet and AMTOR use 
only error detection codes — requesting 
repeats when errors are found. AMTOR is 
effective, but slow. Under the best of condi- 
tions, its efficiency level is 50 percent. 
Packet, too, can quickly bog down on a 
noisy circuit because of its higher data rate. 

For ordinary chitchat, straight RTTY is 
hard to beat. It's too bad the computer can't 
"fill in the blanks" on a few hits the way a 
human operator can, or can it? 

Actually, your computer can fill in the
blanks. Hamming codes, developed by Dr.
Richard Hamming almost 40 years ago,¹ let
computers detect and correct errors in
digital transmissions. These codes are used in
many applications. For instance, Hamming
codes are used to ensure data integrity in
the memory portion of the new VHSIC chips.

I'd like to introduce you to some Hamming 
code basics, and share some look-up tables and 
ideas for future development. I'll also analyze 
how these codes might be used to improve 
packet and AMTOR link performance. 

How they work 

Hamming codes can correct single-bit
transmission errors. The mathematical process
involved is quite complicated, so I'll skip the
theory for now and go directly to an example.

Say you want to encode a four-bit data
word into a seven-bit word called a "Hamming
sequence." This is a 7/4 Hamming code
(four data bits and three parity bits, for a
total of seven transmitted bits) which can
correct single errors and detect two-bit
errors in a received word. To see how this is
done, pick a number between 0 and 15. Let's
try 4. In Table 1, the encode table, find the
number's Hex value (04H). Follow along the
line to find the opposite value - 100 1100,
or 4CH. This is the Hamming sequence
you'll transmit. After you receive that
sequence, decode it using Table 2 - the
decode table. The answer is 04H. Now
simulate noise by changing any one of the
seven received bits to its opposite value. Try
making the least significant bit (LSB) a 1.
This makes the received sequence 100 1101,
or 4DH. Look up this Hamming sequence
in Table 2, and read the decoded value.
You'll find it's still 04H. Change any other
bit, and you'll still obtain the correct value,
04H, from Table 2.

Why? The answer is obvious if you look
at the decode table (Table 2). This table is
eight times larger than the encode table
(Table 1), because the value 04H (like each
of the 15 other values) appears at eight
entries. It's decoded once at the "no
errors" entry of 100 1100 (marked with the
asterisk), and seven times at 100 1101, 100
1110, 100 1000, 100 0100, 101 1100, 110 1100,
and 000 1100 (all seven of the possible
single-bit error positions).
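
The look-up scheme just described can be sketched in a few lines of Python. This is only a sketch, not a reproduction of the published tables: it assumes the bit layout given in the technical section of this article (parity bits at positions 1, 2, and 4, message bits at positions 3, 5, 6, and 7, with position 1 written leftmost), and the names encode, ENCODE, and DECODE are illustrative.

```python
def encode(nibble):
    """Return the 7-bit Hamming sequence for a 4-bit value (0-15)."""
    m3 = (nibble >> 3) & 1
    m5 = (nibble >> 2) & 1
    m6 = (nibble >> 1) & 1
    m7 = nibble & 1
    p1 = m3 ^ m5 ^ m7
    p2 = m3 ^ m6 ^ m7
    p4 = m5 ^ m6 ^ m7
    word = 0
    for bit in (p1, p2, m3, p4, m5, m6, m7):  # positions 1..7, left to right
        word = (word << 1) | bit
    return word

# Encode table: one seven-bit sequence per four-bit value.
ENCODE = [encode(n) for n in range(16)]

# Decode table: 128 entries of (value, no_errors).
DECODE = [None] * 128
for value, word in enumerate(ENCODE):
    DECODE[word] = (value, True)                 # the asterisked entry
    for position in range(7):
        DECODE[word ^ (1 << position)] = (value, False)

print(format(ENCODE[4], "07b"))  # 1001100
print(DECODE[0b1001101])         # (4, False)
```

Every one of the 128 decode entries is filled exactly once, which is why the decode table is eight times the size of the encode table.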

Now change two bits. Make the sequence
100 1111. Your answer will be 07H, with
errors detected. Two-bit errors will always
give the wrong answer, but will never decode
as a "no errors" entry marked with an
asterisk. Depending on the word, three or four
bits out of the seven sent would have to be
changed for that to happen.
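
The two-bit behavior can be checked exhaustively. The sketch below assumes the same bit layout as the technical section of this article (parity at positions 1, 2, and 4, position 1 leftmost); the function names are illustrative, and the decoder here searches for the nearest legal sequence rather than using a stored table.

```python
from itertools import combinations

def encode(nibble):
    """Return the 7-bit Hamming sequence for a 4-bit value (0-15)."""
    m3, m5, m6, m7 = [(nibble >> shift) & 1 for shift in (3, 2, 1, 0)]
    bits = [m3 ^ m5 ^ m7, m3 ^ m6 ^ m7, m3, m5 ^ m6 ^ m7, m5, m6, m7]
    return int("".join(map(str, bits)), 2)

def decode(word):
    """Return (value, no_errors) for the legal sequence within one bit."""
    for value in range(16):
        differing = bin(encode(value) ^ word).count("1")
        if differing <= 1:
            return value, differing == 0
    raise ValueError("no legal sequence within one bit")

print(decode(0b1001111))  # (7, False)

# Try every two-bit error on every legal sequence.
miscorrected = 0
for sent in range(16):
    for i, j in combinations(range(7), 2):
        value, no_errors = decode(encode(sent) ^ (1 << i) ^ (1 << j))
        if value != sent and not no_errors:
            miscorrected += 1
print(miscorrected)  # 336
```

All 336 cases (16 sequences times 21 bit pairs) decode to the wrong value, and none lands on a "no errors" entry.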

It's a bit harder to create a Hamming
sequence. When doing so, you need to get
into the mathematical part of the process.
But because it doesn't really come into play
once a look-up table sequence has been
defined, I'll hold off on the theory once
again. At this point, I'd rather pique your
interest with some practical HF applications
for this encode/decode scheme.

Table 1. Encode table for 7/4 Hamming sequences.

Table 2. Decode table for 7/4 Hamming sequences. Asterisks indicate "no errors detected." All other entries have single errors
detected.

Implementing the Hamming
codes

Why, if they're so easy to implement, 
aren't Hamming codes more popular? First, 
there's no agreed-upon protocol. Second, 
the string of four data bits isn't long 
enough. The Baudot code uses five bits, 
with extra characters (FIGS/LTRS) to shift 
back and forth between two 32-character 
alphabets. ASCII uses seven bits, and eight 
are preferred to allow full data transfer. 
Can't you just break an ASCII word into 
two four-bit "nibbles," encode and send 
each, then decode, correct, and add them 
back up at the receiver? 

In theory, you can. But you'll encounter
another HF error - fading. Fading causes
the complete loss, or "erasure," of one or
several words. And, if the receiver was
expecting an LSB "nibble" when the fade
started, but picked up a most significant bit
(MSB) nibble when it ended, it would
assemble an incorrect word. Because the
receiver's definition of MSB and LSB is
now out of sync, all subsequent words will
be reassembled incorrectly.

In manual systems like straight RTTY, 
you could treat this problem the same way 
you'd treat a FIGS/LTRS garble. You'd use 
a key to direct the computer to shift the 
order in which it's reassembling the nibbles. 

You could also devote one of the four bits
in each nibble as a flag, indicating whether
it's an MSB or LSB nibble. This would leave
three data bits per nibble, or six bits per
pair - enough to encode a 64-character
alphabet like Baudot.
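
A minimal sketch of the flag idea follows. The choice of the top bit of each nibble as the flag, and the names split_char and reassemble, are assumptions made for illustration; the article doesn't define a format. Each flagged nibble would then be sent as its own Hamming sequence.

```python
def split_char(code):
    """Split a 6-bit character code into two flagged 4-bit nibbles."""
    high = 0b1000 | ((code >> 3) & 0b111)  # flag set: upper three bits
    low = code & 0b111                      # flag clear: lower three bits
    return high, low

def reassemble(nibbles):
    """Rebuild 6-bit codes, skipping halves orphaned by an erasure."""
    chars = []
    pending = None
    for nibble in nibbles:
        if nibble & 0b1000:
            pending = nibble & 0b111        # start of a new character
        elif pending is not None:
            chars.append((pending << 3) | (nibble & 0b111))
            pending = None
        # A lower half with no pending upper half is dropped.
    return chars

stream = [n for code in (5, 42, 63) for n in split_char(code)]
del stream[2]              # simulate a fade erasing one nibble
print(reassemble(stream))  # [5, 63]
```

Only the character touched by the fade is lost; the receiver falls back into step on the next flagged nibble instead of garbling everything that follows.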
Figure 1. Hamming application to AX.25.



Packet applications 

A third possibility would be to place the
Hamming codes inside an error-detecting
block code. You could send a block of a
fixed number of characters (say 127),
compute a checksum over the block, and then
send the checksum. The receiver would
perform a similar process, acknowledging the
text if it agrees with the checksum, or
asking for a repeat if it's incorrect. This
common algorithm for data transfer is used by
XMODEM for landline and in the link
protocol in AX.25 packet.

The AX.25 packet protocol is organized 
in "layers." The link layer organizes a block, 
computes checksums, determines if a block 
is received correctly, and handles repeat 
requests. The bottom-most layer is the phys- 
ical layer. This layer is normally concerned 
only with modulation and demodulation. 
It's at the physical layer that you'd intercept 
a single eight-bit word on its way to the 
modulator, encode it, and reverse the pro- 
cess at the receiver. Figure 1 shows how you 
could include Hamming encoding and 
decoding at this level without disturbing any 
of the other layers, except, perhaps, to allow 
more time for the longer Hamming codes. 
What do you gain by doing this? 

The AX.25 protocol already accounts for 
packets of incorrect length. Thus, erasures, 
and the framing problems they may generate, 
can be detected and handled by requests for 
retransmission generated by the existing link 
protocol. And, because the checksum is also 
encoded as a Hamming sequence, errors 
which can't be corrected will probably be 
detected as well, resulting in a retransmis- 
sion request. You can do all of this without 
touching the higher-order packet protocols. 
In all probability, this scheme will correct 
all single-bit errors per Hamming sequence, 
and detect all erasures (incorrect packet length) 
and uncorrectable errors (bad checksum). 
How high is this probability? Let's see. 

Number crunching 

The Bit Error Rate (BER) is the probability 
that any bit will be changed. The basic 
packet word is eight bits long, and each bit 
must be correct. The probability of receiv- 
ing an entirely correct eight-bit word is: 

P(8 correct bits) = (1 - BER)^8 (1)

Packets come in various lengths; 128 words 
is representative. The last word is a check- 
sum, allowing errors in the block to be 
detected. All 128 words must be received 
correctly. The probability that 128 correct 
eight-bit words, or one entire packet of 
representative length, will be received is: 

P(correct packet) = P(8 correct bits)^128 (2)

The probability that a correct packet will 
be received is defined as the number of suc- 
cesses divided by the number of attempts. 
The inverse of this is the average number of 
attempts that must be made to get one good 
packet through: 

N(attempts) = 1/P(correct packet) (3)

The results are plotted on the graph in 
Figure 2. They show that, without error cor- 
rection, the number of attempts remains at 
essentially one transmission per packet up 
to about 0.0001 BER (1 bit per 10,000 
altered). The number of attempts then rises 
quickly to an average of two transmissions 
at a BER of 0.0005 and three at 0.001. 
Beyond that level, the number of attempts
required to successfully get a packet through
becomes astronomical.
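
Equations 1 through 3 are easy to evaluate directly. The following is a minimal sketch; the function name attempts_uncorrected and its parameter names are illustrative, and it assumes, as the analysis does, that bit errors are independent.

```python
def attempts_uncorrected(ber, words=128, bits_per_word=8):
    """Average transmissions needed to get one error-free packet through."""
    p_word = (1 - ber) ** bits_per_word   # Equation 1
    p_packet = p_word ** words            # Equation 2
    return 1 / p_packet                   # Equation 3

for ber in (0.0001, 0.0005, 0.001, 0.005):
    print(f"BER {ber}: {attempts_uncorrected(ber):.2f} attempts")
```

Rounded, this gives about 1.1, 1.7, and 2.8 attempts at BERs of 0.0001, 0.0005, and 0.001, with the count climbing steeply after that.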

By comparison, a Hamming sequence 
will be accepted if it has either no errors, or 
a single-bit error. Because a Hamming 
sequence is seven bits long, the no-error 
probability is: 

P(7 correct bits) = (1 - BER)^7 (4)

and the probability of exactly one error is: 



P(exactly 1 bit error in 7) =
7*BER*(1 - BER)^6 (5)



Thus, the probability of no errors, or exactly 
one error, is the sum of Equations 4 and 5: 

P(0 or 1 error in 7) = P(7 correct bits) +
P(exactly 1 bit error in 7) (6)

Because you can only encode four bits 
onto a seven-bit Hamming sequence, you 
need 256 Hamming words with one or zero 
errors each, to convey 128 words with no 
errors. The probability of this occurring is: 



P(256 correctable sequences) =
P(0 or 1 error in 7)^256 (7)



And, like the eight-bit word, the number 
of attempts required is: 

N(Hamming attempts) = 
1/P(256 correctable sequences) (8) 

This is also plotted in Figure 2. You can
see that Hamming sequences require no
retransmittal until there's a BER of 0.005 -
nearly twice the BER that will bring an 
uncorrected link to its knees! A Hamming
encoded link can maintain a useful throughput
with less than two attempts per packet
until it reaches a BER of 0.01 - nearly 20
times higher than the link without error
correction.
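
Equations 4 through 8 can be evaluated the same way. This is a sketch under the analysis's own assumption of independent bit errors; the function name attempts_hamming and its parameters are illustrative.

```python
def attempts_hamming(ber, sequences=256, n=7):
    """Average transmissions when every sequence may carry one bit error."""
    p_clean = (1 - ber) ** n                    # Equation 4
    p_single = n * ber * (1 - ber) ** (n - 1)   # Equation 5
    p_ok = p_clean + p_single                   # Equation 6
    p_packet = p_ok ** sequences                # Equation 7
    return 1 / p_packet                         # Equation 8

for ber in (0.001, 0.005, 0.01, 0.02):
    print(f"BER {ber}: {attempts_hamming(ber):.2f} attempts")
```

Running it over a range of BERs traces out the Hamming curve for a 128-word packet sent as 256 sequences.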

Of course, this doesn't come without cost. 
You may have noticed that Hamming 
sequences are 7/4 longer; that is, they are 
nearly twice as long as the unprotected 
packet. Does this overhead pay for itself? 

Yes. The number of bits per packet is the 
basic number of bits per individual packet 
times the average number of attempts: 



TXBITS(Hamming) =
7*256*N(Hamming attempts) (9)

and

TXBITS(Normal) = 8*128*N(attempts) (10)

These are plotted in Figure 3. Note that 
for low error rates, the uncorrected link 
without Hamming codes outperforms the 
Hamming link by almost 2:1, requiring only 
1024 bits compared with 1792 Hamming 
bits. The link errors are too few to justify 
the high overhead of the Hamming bits. At 
about 0.0005 BER, the two are equal in per- 
formance. On the average, the Hamming 
link will require less than two transmissions 
for BERs up to 0.01. 
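
The trade-off in Equations 9 and 10 can be sketched as follows. The function names are illustrative, and the attempt counts are computed from Equations 1 through 8 under the same assumption of independent bit errors.

```python
def tx_bits_normal(ber):
    """Equation 10: 1024 bits per try, times the average number of tries."""
    p_packet = (1 - ber) ** (8 * 128)
    return 8 * 128 / p_packet

def tx_bits_hamming(ber):
    """Equation 9: 1792 bits per try, times the average number of tries."""
    p_ok = (1 - ber) ** 7 + 7 * ber * (1 - ber) ** 6
    return 7 * 256 / p_ok ** 256

for ber in (0.0001, 0.0005, 0.001, 0.005):
    print(f"BER {ber}: normal {tx_bits_normal(ber):.0f} bits, "
          f"Hamming {tx_bits_hamming(ber):.0f} bits")
```

At very low error rates the uncorrected link sends fewer bits; by a BER of 0.001 the Hamming link is well ahead.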

This analysis doesn't include the possibil- 
ity of using the more robust, but slower, 
Hamming codes to support HF data rates 
which could go as high as 1200 baud — 
unthinkably fast for conventional HF packet. 

Application to AMTOR 

What about AMTOR? AMTOR uses a
seven-bit alphabet, with just four 1s and
three 0s. A total of 35 characters can be
encoded. Characters are sent in groups of
three. Any character which doesn't conform
to this 4/3 sequence is detected as an error,
and generates an RQ (Repeat Request).

Figure 3. Average transmitted bits per 128-byte packet.

Figure 4. Attempts versus BER, AMTOR 3 bytes x 7 bit group.

The analysis is basically the same as it is
for packet. The exception is that the basic
block is three seven-bit words, which can
be encoded onto six seven-bit Hamming
sequences. Figures 4 and 5 show the results.
Surprisingly, Hamming sequences have little
advantage over short AMTOR sequences.
Uncorrected AMTOR more than holds its
own until enormously high BERs of 0.05
are reached. Only at absolutely incredible
BERs of 0.1 does
AMTOR Hamming gain a 2:1 advantage
over straight AMTOR. These results
shouldn't come as a great surprise to
AMTOR enthusiasts. The key lies in the
very short AMTOR block lengths. However,
I haven't taken the effects of false
acknowledgement into consideration here -
specifically the misinterpretation of an "ACK"
(Acknowledgement) as an "RQ" (Repeat
Request).
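
The same arithmetic applies to the short AMTOR block. The sketch below rests on two simplifying assumptions of mine: a plain AMTOR block is accepted only when all 21 bits arrive correctly, and only transmitted bits are counted, with changeover time ignored. The function names are illustrative.

```python
def amtor_plain_bits(ber):
    """Three 7-bit characters, repeated until all 21 bits are correct."""
    return 21 / (1 - ber) ** 21

def amtor_hamming_bits(ber):
    """Six 7-bit Hamming sequences, each allowed one bit error."""
    p_ok = (1 - ber) ** 7 + 7 * ber * (1 - ber) ** 6
    return 42 / p_ok ** 6

for ber in (0.001, 0.01, 0.05, 0.1):
    print(f"BER {ber}: plain {amtor_plain_bits(ber):.1f} bits, "
          f"Hamming {amtor_hamming_bits(ber):.1f} bits")
```

With blocks this short, the doubled length of the Hamming version dominates until the error rate becomes very high.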

Frankly, AMTOR doesn't appear to be 
much improved by Hamming sequences. 
Using a higher data rate wouldn't change 
this situation much because AMTOR 
spends a significant amount of time waiting 
for transmitters to change over. Shortening 
the data transmission time wouldn't appear 
to reduce the overall time significantly. 
AMTOR efficiency might improve if a 
longer sequence with Hamming codes is 
used, but that would involve a major change 
to the AMTOR protocol. 

Technical details of 
Hamming codes 

How do Hamming codes work? To
understand them, you must first
understand the concept of Hamming
distance. Hamming distance is the number of
bits that would have to be changed to
transform one binary word into another. For
example: 0011, 0101, 1001, and 0000 are all
at a Hamming distance of one from 0001. By
contrast, 1110 is separated from 0001 by a
distance of four; all four bits would
have to be altered to change 1 (01H) to 14
(0EH). Hamming distance can be computed
by "exclusive OR'ing" the two binary words
together and counting the 1s.
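
In Python the XOR-and-count rule is a one-liner. The function name hamming_distance is illustrative.

```python
def hamming_distance(a, b):
    """Count the bit positions in which two binary words differ."""
    return bin(a ^ b).count("1")

neighbors = (0b0011, 0b0101, 0b1001, 0b0000)
print([hamming_distance(0b0001, word) for word in neighbors])  # [1, 1, 1, 1]
print(hamming_distance(0b0001, 0b1110))                        # 4
```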

Hamming sequences which can correct
single-bit errors use an "alphabet" in which
all the legal sequences that can be
transmitted have a Hamming distance of at least
three from each other. For example, Table 1, the
encode table used in the text, has sixteen seven-bit
sequences out of a possible 128, all of
which differ from each other by three or
more bits. The other 112 possibilities
represent erroneous sequences caused by noise
that creates a one-bit change in one of the
sixteen legal sequences. Because the legal
sequences are separated from each other by
at least three bits, these erroneous sequences
will be separated by two or more bits from all other
legal sequences - except for the one which
was actually sent. Using the example in the
text, if the sequence encoding 04H is altered
at the LSB to 100 1101, this sequence is still
at least two bits removed from any other
Table 1 sequence, except 100 1100. Thus,
you can assume that an erroneous sequence
should actually be the legal sequence
"closest" to it.

Two-bit errors still won't produce a legal 
sequence. They won't decode correctly, either. 
Hamming codes of distance three can't dis- 
tinguish between a correctable single-bit 
error and an uncorrectable two-bit error. 
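
The distance-three property can be confirmed by comparing every pair of legal sequences. As before, this sketch assumes the bit layout described in this article (parity at positions 1, 2, and 4, position 1 leftmost), and the names are illustrative.

```python
from itertools import combinations

def encode(nibble):
    """Return the 7-bit Hamming sequence for a 4-bit value (0-15)."""
    m3, m5, m6, m7 = [(nibble >> shift) & 1 for shift in (3, 2, 1, 0)]
    bits = [m3 ^ m5 ^ m7, m3 ^ m6 ^ m7, m3, m5 ^ m6 ^ m7, m5, m6, m7]
    return int("".join(map(str, bits)), 2)

legal = [encode(n) for n in range(16)]
distances = {bin(a ^ b).count("1") for a, b in combinations(legal, 2)}
print(sorted(distances))  # [3, 4, 7]
```

No two legal sequences are closer than three bits, so a two-bit error always lands within one bit of some other legal sequence and gets "corrected" to it.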

Generating the Hamming 
sequence 

Hamming sequences use parity bits based 
on the message word to be encoded. The 
parity bits are defined so they will actually 
point to zero (the error-free condition), or a 
binary number representing the bit position 
in error. Because of this, bits in a Hamming 
sequence are numbered from 1 to N, rather 
than the binary starting point of zero. Three 
parity bits are needed to handle a seven-bit 
sequence, leaving four bits available as mes- 
sage bits. 

Table 3 shows how parity bits are defined
for 7/4, 15/11 (four parity bits), and 31/26
(five parity bits) sequences. Parity bits are
assigned to locations corresponding to
integer powers of two within the sequence
(bits P1, P2, P4, P8, and P16), and message
bits to all others. A "1" at the intersection
of a message bit row and parity bit column
means that the message bit should be
included when determining the corresponding
parity bit. Following the example in the
text, "1s" appear opposite M3, M5, and M7,
under parity bit P1. These bits are XOR'd
together to form parity bit P1. P2 is the
XOR of message bits M3, M6, and M7, and
P4 is the XOR of M5, M6, and M7.
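
The parity rule generalizes neatly: a message bit at position k feeds parity bit P(2^i) whenever bit i of k is a 1. The sketch below uses that standard construction, which matches the P1, P2, and P4 assignments given above; I'm assuming it also matches the 15/11 and 31/26 columns of Table 3. The function name hamming_encode is illustrative.

```python
def hamming_encode(message_bits, parity_count):
    """Build a Hamming sequence as a list of bits for positions 1..n."""
    n = (1 << parity_count) - 1
    if len(message_bits) != n - parity_count:
        raise ValueError("wrong number of message bits")
    seq = [0] * (n + 1)                  # index 0 unused
    bits = iter(message_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):              # not a power of two: message bit
            seq[pos] = next(bits)
    for i in range(parity_count):
        parity_pos = 1 << i
        for pos in range(1, n + 1):
            if pos & parity_pos and pos != parity_pos:
                seq[parity_pos] ^= seq[pos]
    return seq[1:]

print(hamming_encode([0, 1, 0, 0], 3))  # [1, 0, 0, 1, 1, 0, 0]
```

Passing 11 message bits with 4 parity bits, or 26 with 5, produces the longer 15/11 and 31/26 sequences.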

Hamming sequences were originally
intended for hardware generation and
detection. Figure 6A shows how the 7/4 code
above can be created in hardware for the 
transmitter. The parity bits are woven into 
the sequence as shown. This figure illus- 
trates the generation of the 1001100 
sequence for 04H, given in the text. 

The receiver in Figure 6B generates the
same parity bits - P1, P2, and P4 - and
XOR's each with its corresponding received
parity. The resulting three-bit word is called
the "syndrome," and points to the binary
position of the bit in error. As shown, the
receiver copies 1001101, generating a syndrome
of 7. The syndrome is applied to a 1-of-8
decoder. Zero is the "no errors detected"
condition; 1, 2, and 4 indicate that the
parity bits themselves were in error, so no
message bit needs correcting. Finally,
outputs 3, 5, 6, and 7 are XOR'd with their
corresponding message bits. Bit 7 in the
example is XOR'd by the decoded syndrome,
back to its correct value of 0.
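
The receiver logic can be sketched in software as well. XOR'ing together the position numbers of every received 1 gives the same syndrome as recomputing each parity bit and comparing it with the received one. The function name correct is illustrative, and unlike the hardware described above, this sketch also flips a bad parity bit, which is harmless.

```python
def correct(seq):
    """seq holds the 7 received bits for positions 1..7.

    Returns (message_bits, syndrome), with message bits in M3, M5, M6, M7 order.
    """
    syndrome = 0
    for position, bit in enumerate(seq, start=1):
        if bit:
            syndrome ^= position
    fixed = list(seq)
    if syndrome:                        # zero means no errors detected
        fixed[syndrome - 1] ^= 1        # flip the bit the syndrome points to
    return [fixed[2], fixed[4], fixed[5], fixed[6]], syndrome

print(correct([1, 0, 0, 1, 1, 0, 1]))  # ([0, 1, 0, 0], 7)
```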

The Hamming codes were developed when
such hardware solutions were essential for
implementing the technique. Solutions of
this type are still necessary for longer
sequences where the decode tables can
become prohibitively large. However, as
noted in the text, look-up table schemes in
software are now a more efficient
implementation for short sequences like the 7/4 code.

Table 3. General scheme for determining parity for 7/4, 15/11, and 31/26 Hamming sequences.
A "1" indicates that the corresponding message bit should be included in the XOR tree for that
particular parity bit.

[Figure 6B diagram. Recoverable labels: M7 with 1 error; syndrome; 1-of-8 decoder; no errors detected.]

Hamming codes assume that two-bit 
errors within a word are much less likely to 
appear than single-bit errors. This isn't 
strictly true. Many errors occur in bursts,
causing multiple errors and even complete
loss, or erasure, of the entire word.
Nevertheless, Hamming codes can easily 
correct most errors in communications. 

Summary 

The main limitation to the more wide- 
spread implementation of this simple 
scheme is the lack of an accepted protocol. 
The techniques I've described here are easily 
implemented on any computer/TU system 
capable of straight ASCII operation. The 
integration of this technique into the AX.25 
packet protocol may be a bit more compli- 
cated, but it would improve HF packet 
throughput significantly. In fact, Hamming 
code applications could improve this form 
of packet to such a dramatic extent, that 
some serious research may be in order. 

I invite all who wish to experiment with 
the development of a new protocol based on 
the Hamming technique to contact me, 
either by mail or at my packet address, 
KB61C @ K0BOY. ■ 

Figure 6B. Hardware decoding of a 7/4 receiver, showing the bit in error and the corrections.